Anomaly Detection Using Self-Organizing Maps-Based K-Nearest Neighbor Algorithm

Authors

  • Jing Tian
  • Michael H. Azarian
  • Michael Pecht
Abstract

Self-organizing maps have been used extensively for condition-based maintenance, where quantization errors of test data referring to the self-organizing maps of healthy training data have been used as features. Researchers have used minimum quantization error as a health indicator, which is sensitive to noise in the training data. Some other researchers have used the average of the quantization errors as a health indicator, where the best matching units of the trained self-organizing maps are required to be convex. These requirements are not always satisfied. This paper introduces a method that improves self-organizing maps for anomaly detection by addressing these issues. Noise-dominated best matching units extracted from the map trained by the healthy training data are removed, and the rest are used as healthy references. For a given test data observation, the k-nearest neighbor algorithm is applied to identify neighbors of the observation that occur in the references. Then the Euclidean distance between the test data observation and the centroid of the neighbors is calculated as a health indicator. Compared with the minimum quantization error, the health indicator extracted by this method is less sensitive to noise, and compared with the average of quantization errors, it does not put limitations on the convexity or distribution of the best matching units. The result was validated using data from experiments on cooling fan bearings.
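
The health indicator described in the abstract (Euclidean distance from a test observation to the centroid of its k nearest healthy references) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `knn_centroid_health_indicator`, the default `k=3`, and the toy data are assumptions, and the sketch presumes the healthy references (best matching units with noise-dominated ones already removed) are supplied as an array, since the abstract does not say how that filtering is done.

```python
import numpy as np


def knn_centroid_health_indicator(references, observation, k=3):
    """Distance from an observation to the centroid of its k nearest references.

    `references` is assumed to hold the retained best matching units of a
    self-organizing map trained on healthy data (one row per unit).
    The name and the default k are illustrative, not taken from the paper.
    """
    references = np.asarray(references, dtype=float)
    observation = np.asarray(observation, dtype=float)
    if not 1 <= k <= len(references):
        raise ValueError("k must be between 1 and the number of references")

    # Euclidean distance from the observation to every healthy reference.
    distances = np.linalg.norm(references - observation, axis=1)

    # Indices of the k nearest references.
    neighbor_idx = np.argsort(distances)[:k]

    # Centroid of those neighbors, then distance from the observation to it.
    centroid = references[neighbor_idx].mean(axis=0)
    return float(np.linalg.norm(observation - centroid))


if __name__ == "__main__":
    # Toy healthy references (hypothetical values, for illustration only).
    healthy_refs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    near = knn_centroid_health_indicator(healthy_refs, [0.5, 0.5], k=3)
    far = knn_centroid_health_indicator(healthy_refs, [5.0, 5.0], k=3)
    print(near, far)  # the distant observation yields the larger indicator
```

Averaging over k neighbors rather than taking the single closest unit is what makes the indicator less dependent on any one reference, which matches the abstract's claim about reduced noise sensitivity relative to the minimum quantization error.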

Similar resources

Primary User Boundary Detection in Cognitive Radio Networks: Estimated Secondary User Locations and Impact of Malicious Secondary Users

Cognitive radio is an effective technology to improve spectrum utilization through spectrum sharing among primary users (PUs) and secondary users (SUs). Developing accurate and efficient methods to determine the PU coverage is crucial in order to avoid potential interference from SU to PU. In this paper, we propose to use the estimated location of each SU, which is obtained using the self-organ...

Large-Scale Mapping of Carbon Stocks in Riparian Forests with Self-Organizing Maps and the k-Nearest-Neighbor Algorithm

Among the machine learning tools being used in recent years for environmental applications such as forestry, self-organizing maps (SOM) and the k-nearest neighbor (kNN) algorithm have been used successfully. We applied both methods for the mapping of organic carbon (Corg) in riparian forests due to their considerably high carbon storage capacity. Despite the importance of floodplains for carbon...

Neural-based color image segmentation and classification using self-organizing maps

This paper presents a method for color image segmentation which uses classification to group pixels into regions. The chromaticity is used as data source for the method because it is normalized and considers only hue and saturation, excluding the luminance component. The classification is carried out by means of a self-organizing map (SOM), which is employed to obtain the main chromaticities pr...

Prototype Generation Using Self-Organizing Maps for Informativeness-Based Classifier

The k nearest neighbor is one of the most important and simple procedures for data classification task. The kNN, as it is called, requires only two parameters: the number of k and a similarity measure. However, the algorithm has some weaknesses that make it impossible to be used in real problems. Since the algorithm has no model, an exhaustive comparison of the object in classification analysis...

Processing Missing Values with Self-Organized Maps

This paper introduces modifications of Self-Organizing Maps allowing imputation and classification of data containing missing values. The robustness of the proposed modifications is shown using experimental results of a standard data set. A comparison to modified Fuzzy cluster methods [Timm et al., 2002] is presented. Both methods performed better with available case analysis compared to comple...

Publication date: 2014